Skip to content

[IR] Don't skip catchswitch when calling BB::getFirstNonPHIOrDbgOrAlloca() #136056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tylanphear
Copy link
Contributor

@tylanphear tylanphear commented Apr 16, 2025

BasicBlock::getFirstNonPHIOrDbgOrAlloca() is supposed to yield the first instruction in the block which is not a PHI or dbg or alloca instruction. To that effect, it also skips an EHPad instruction which might be present after any PHIs. Normally, the latter is fine, but it has an unexpected interaction with catchswitch blocks:

bb:
    ; optional PHIs
    %0 = catchswitch within none [label %catch] unwind to caller

If we skip over the catchswitch instruction in the previous block, we'll return the end iterator, despite catchswitch not being a PHI or dbg or alloca instruction.

This patch amends the logic to not skip an EHPad instruction which is also a terminator. This means that calling
getFirstNonPHIOrDbgOrAlloca() on a non-empty block with a terminator (such as the above) will always yield a valid instruction iterator. This appears to be the expected behavior anyways, as most usages of this method do not check if the returned iterator is end().

Closes #136048.

`getFirstNonPHIOrDbgOrAlloca()`

`BasicBlock::getFirstNonPHIOrDbgOrAlloca()` is supposed to yield the
first instruction in the block which is not a PHI or dbg or alloca
instruction. To that effect, it also skips an EHPad instruction which
might be present after any PHIs. Normally, the latter is fine, but it
has an unexpected interaction with `catchswitch` blocks:

```
bb:
    ; optional PHIs
    %0 = catchswitch within none [label %catch] unwind to caller
```

If we skip over the `catchswitch` instruction in the previous block,
we'll return the end iterator, despite `catchswitch` not being a PHI or
dbg or alloca instruction.

This patch amends the logic to not skip an EHPad instruction which is
also a terminator. This means that calling
`getFirstNonPHIOrDbgOrAlloca()` on a non-empty block with a terminator
(such as the above) will always yield a valid instruction iterator. This
appears to be the expected behavior anyways, as most usages of this
method do not check that the returned iterator is `end()`.

Closes llvm#136048.
@llvmbot
Copy link
Member

llvmbot commented Apr 16, 2025

@llvm/pr-subscribers-llvm-ir

Author: Tyler Lanphear (tylanphear)

Changes

BasicBlock::getFirstNonPHIOrDbgOrAlloca() is supposed to yield the first instruction in the block which is not a PHI or dbg or alloca instruction. To that effect, it also skips an EHPad instruction which might be present after any PHIs. Normally, the latter is fine, but it has an unexpected interaction with catchswitch blocks:

bb:
    ; optional PHIs
    %0 = catchswitch within none [label %catch] unwind to caller

If we skip over the catchswitch instruction in the previous block, we'll return the end iterator, despite catchswitch not being a PHI or dbg or alloca instruction.

This patch amends the logic to not skip an EHPad instruction which is also a terminator. This means that calling
getFirstNonPHIOrDbgOrAlloca() on a non-empty block with a terminator (such as the above) will always yield a valid instruction iterator. This appears to be the expected behavior anyways, as most usages of this method do not check that the returned iterator is end().

Closes #136048.


Full diff: https://github.com/llvm/llvm-project/pull/136056.diff

1 Files Affected:

  • (modified) llvm/lib/IR/BasicBlock.cpp (+1-1)
diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index c632b1b2dc2ab..f870fb53307ea 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -455,7 +455,7 @@ BasicBlock::const_iterator BasicBlock::getFirstNonPHIOrDbgOrAlloca() const {
   if (InsertPt == end())
     return end();
 
-  if (InsertPt->isEHPad())
+  if (InsertPt->isEHPad() && !InsertPt->isTerminator())
     ++InsertPt;
 
   if (isEntryBlock()) {

@tylanphear tylanphear changed the title [IR] Don't skip catchswitch instructions when calling getFirstNonPHIOrDbgOrAlloca() [IR] Don't skip catchswitch when calling BB::getFirstNonPHIOrDbgOrAlloca() Apr 16, 2025
@majnemer
Copy link
Contributor

Interesting bug! This is exposing a conflict between getFirstNonPHIOrDbgOrAlloca and one of its uses cases: scanning "interesting" instructions in a basic block.

However changing this is also in conflict with other use cases:

  • Finding an insertion point where an instruction could go with a special case for the entry block wrt AllocaInst
  • Per its documentation: "Returns an iterator to the first instruction in this block that is not a PHINode, a debug intrinsic, a static alloca or any pseudo operation."

catchswitch is definitely a pseudo operation.

I would not recommend changing this functions behavior to deviate from the documentation for two reasons:

  • Perhaps there is code which uses the return result of end() to indicate that they should not try to do something with this BB.
  • Having getFirstInsertionPt behave differently from getFirstNonPHIOrDbgOrAlloca in this way is surprising.

IMO, a cleaner fix would be to change CheckForNonVecCallsInSameBlock so that it took iterators instead of Instruction*. Have you considered something along those lines?

@tylanphear
Copy link
Contributor Author

tylanphear commented Apr 17, 2025

@majnemer I see. I'm not overly familiar with catchswitch and friends, so it wasn't immediately clear to me that it should be considered a pseudo instruction. Its semantics actually seem a bit strange, and it looks like we have to have some special logic elsewhere in SLPVectorizer to handle its presence (i.e. SLPVectorizer.cpp:9708)

I think it's possible to make the modification to CheckForNonVecCallsInSameBlock as you describe, though doing so cleanly seems non-trivial to do, and I'm not sure if that's the right approach. Perhaps @alexey-bataev would have some opinion?

@majnemer
Copy link
Contributor

Another way to go would be to do something like:

      if (auto It = BB->getFirstNonPHIOrDbgOrAlloca();
          It != BB->end() && 
          !CheckForNonVecCallsInSameBlock(&*It,
                                          BB->getTerminator()))
        return Res;

This is consistent with the existing code as there are no call instructions and thus the condition is vacuously maintained.

@tylanphear
Copy link
Contributor Author

That was my first instinct, though I do wonder: should the condition be It == BB->end() || !CheckForNonVecCallsInSameBlock(&*It, BB->getTerminator() instead? I admit, I'm not following the logic in this function well.

Probably the other use of getFirstNonPHIOrDbgOrAlloca() further down in the same function needs to be guarded as well.

@majnemer
Copy link
Contributor

My thinking was that the behavior of the caller should be as-if it just passed in a basic block which just contained a terminator like br. There are no call-sites in such a basic block so I had it continue on rather than return.

@tylanphear
Copy link
Contributor Author

The issue in SLP was fixed by @alexey-bataev in 5fe91f1 -- as discussed above, I don't think this PR needs to be pursued any further. Thanks @majnemer for review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calling BasicBlock::getFirstNonPHIOrDbgOrAlloca on a catchswitch block yields an invalid (end) iterator.
3 participants